stochastic nested problem
Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems
Stochastic nested optimization, including stochastic compositional, min-max, and bilevel optimization, is gaining popularity in many machine learning applications. While the three problems share a nested structure, existing works often treat them separately, thus developing problem-specific algorithms and analyses. Among various exciting developments, simple SGD-type updates (potentially on multiple variables) are still prevalent in solving this class of nested problems, but they are believed to have a slower convergence rate than non-nested problems.
- Europe > Austria > Vienna (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
- (7 more...)
Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems
Stochastic nested optimization, including stochastic compositional, min-max, and bilevel optimization, is gaining popularity in many machine learning applications. While the three problems share a nested structure, existing works often treat them separately, thus developing problem-specific algorithms and analyses. Among various exciting developments, simple SGD-type updates (potentially on multiple variables) are still prevalent in solving this class of nested problems, but they are believed to have a slower convergence rate than non-nested problems. By leveraging the hidden smoothness of the problem, this paper presents a tighter analysis of ALSET for stochastic nested problems. Under the new analysis, to achieve an \epsilon -stationary point of the nested problem, it requires {\cal O}(\epsilon {-2}) samples in total.